91 research outputs found

    VUV photochemistry of the H2O···CO complex in noble-gas matrices: formation of the OH···CO complex and the HOCO radical

    Vacuum ultraviolet (VUV, 130-170 nm) photochemistry of the H2O···CO complex is studied by matrix-isolation infrared spectroscopy. The H2O···CO complexes in Ne, Ar, Kr, and Xe matrices are generated by ultraviolet (UV, 193 and 250 nm) photolysis of formic acid (HCOOH). VUV photolysis of the H2O···CO complexes is found to lead to the formation of the OH···CO radical-molecule complexes and trans-HOCO radicals. It is shown that the matrix material, local matrix morphology, and possibly the H2O···CO complex geometry strongly affect the VUV photolysis pathways. The intrinsic reactivity of the matrix-isolated OH···CO complex resulting in the formation of trans-HOCO is directly demonstrated for the first time. This reaction occurs in Ar, Kr, and Xe matrices upon annealing above 25 K and may proceed over the barrier. The case of a Ne matrix is very special because the formation of trans-HOCO from the OH···CO complex is observed even at the lowest experimental temperature (4.5 K), which is in sharp contrast to the other matrices. It follows that quantum tunneling is probably involved in this process in the Ne matrix at such a low temperature. Infrared light also promotes this reaction in the Ne matrix at 4.5 K, which is not the case in the other matrices. The last findings show the effect of the environment on the tunneling and infrared-induced rates of this fundamental chemical reaction. Peer reviewed

    SciRepEval: A Multi-Format Benchmark for Scientific Document Representations

    Learned representations of scientific documents can serve as valuable input features for downstream tasks, without the need for further fine-tuning. However, existing benchmarks for evaluating these representations fail to capture the diversity of relevant tasks. In response, we introduce SciRepEval, the first comprehensive benchmark for training and evaluating scientific document representations. It includes 25 challenging and realistic tasks, 11 of which are new, across four formats: classification, regression, ranking and search. We then use the benchmark to study and improve the generalization ability of scientific document representation models. We show how state-of-the-art models struggle to generalize across task formats, and that simple multi-task training fails to improve them. However, a new approach that learns multiple embeddings per document, each tailored to a different format, can improve performance. We experiment with task-format-specific control codes and adapters in a multi-task setting and find that they outperform the existing single-embedding state-of-the-art by up to 1.5 points absolute. Comment: 21 pages, 2 figures, 9 tables. For associated code, see https://github.com/allenai/scirepeva

    ABNIRML: Analyzing the Behavior of Neural IR Models

    Numerous studies have demonstrated the effectiveness of pretrained contextualized language models such as BERT and T5 for ad-hoc search. However, it is not well understood why these methods are so effective, what makes some variants more effective than others, and what pitfalls they may have. We present a new comprehensive framework for Analyzing the Behavior of Neural IR ModeLs (ABNIRML), which includes new types of diagnostic tests that allow us to probe several characteristics, such as sensitivity to word order, that are not addressed by previous techniques. To demonstrate the value of the framework, we conduct an extensive empirical study that yields insights into the factors that contribute to the neural model's gains, and identify potential unintended biases the models exhibit. We find evidence that recent neural ranking models have fundamentally different characteristics from prior ranking models. For instance, these models can be highly influenced by altered document word order, sentence order and inflectional endings. They can also exhibit unexpected behaviors when additional content is added to documents, or when documents are expressed with different levels of fluency or formality. We find that these differences can depend on the architecture and not just the underlying language model.

    Literature-Augmented Clinical Outcome Prediction

    We present BEEP (Biomedical Evidence-Enhanced Predictions), a novel approach for clinical outcome prediction that retrieves patient-specific medical literature and incorporates it into predictive models. Based on each individual patient's clinical notes, we train language models (LMs) to find relevant papers and fuse them with information from notes to predict outcomes such as in-hospital mortality. We develop methods to retrieve literature based on noisy, information-dense patient notes, and to augment existing outcome prediction models with retrieved papers in a manner that maximizes predictive accuracy. Our approach boosts predictive performance on three important clinical tasks in comparison to strong recent LM baselines, increasing F1 by up to 5 points and precision@Top-K by a large margin of over 25%. Comment: To appear in Findings of NAACL 2022. Code available at: https://github.com/allenai/BEE

    Spectroscopic characterization of the complex of vinyl radical and carbon dioxide: Matrix isolation and ab initio study

    We report on the preparation and vibrational characterization of the C2H3···CO2 complex, the first example of a stable intermolecular complex involving vinyl radicals. This complex was prepared in Ar and Kr matrices using UV photolysis of propiolic acid (HC3OOH) and subsequent thermal mobilization of H atoms. This preparation procedure provides vinyl radicals formed exclusively as a complex with CO2, without the presence of either CO2 or C2H3 monomers. The absorption bands corresponding to the ν5(C2H3), ν7(C2H3), ν8(C2H3), ν2(CO2), and ν3(CO2) modes of the C2H3···CO2 complex were detected experimentally. The calculations at the UCCSD(T)/L2a level of theory predict two structures of the C2H3···CO2 complex with Cs and C1 symmetries and interaction energies of -1.92 and -5.19 kJ mol⁻¹. The harmonic vibrational frequencies of these structures were calculated at the same level of theory. The structural assignment of the experimental species is not straightforward because of rather small complexation-induced shifts and matrix-site splitting of the bands (for both complex and monomers). We conclude that the C1 structure is the most probable candidate for the experimental C2H3···CO2 complex based on the significant splitting of the bending vibration of CO2 and on the energetic and structural considerations. Published by AIP Publishing. Peer reviewed

    RCT Rejection Sampling for Causal Estimation Evaluation

    Confounding is a significant obstacle to unbiased estimation of causal effects from observational data. For settings with high-dimensional covariates -- such as text data, genomics, or the behavioral social sciences -- researchers have proposed methods to adjust for confounding by adapting machine learning methods to the goal of causal estimation. However, empirical evaluation of these adjustment methods has been challenging and limited. In this work, we build on a promising empirical evaluation strategy that simplifies evaluation design and uses real data: subsampling randomized controlled trials (RCTs) to create confounded observational datasets while using the average causal effects from the RCTs as ground-truth. We contribute a new sampling algorithm, which we call RCT rejection sampling, and provide theoretical guarantees that causal identification holds in the observational data to allow for valid comparisons to the ground-truth RCT. Using synthetic data, we show our algorithm indeed results in low bias when oracle estimators are evaluated on the confounded samples, which is not always the case for a previously proposed algorithm. In addition to this identification result, we highlight several finite data considerations for evaluation designers who plan to use RCT rejection sampling on their own datasets. As a proof of concept, we implement an example evaluation pipeline and walk through these finite data considerations with a novel, real-world RCT -- which we release publicly -- consisting of approximately 70k observations and text data as high-dimensional covariates. Together, these contributions build towards a broader agenda of improved empirical evaluation for causal estimation. Comment: Code and data at https://github.com/kakeith/rct_rejection_samplin